Two-Pattern Strings — Computing Repetitions & Near-Repetitions

Authors

  • Frantisek Franek
  • Weilin Lu
  • W. F. Smyth
Abstract

In a recent paper we introduced infinite two-pattern strings on the alphabet {a, b} as a generalization of Sturmian strings, and we posed three questions about them:

  • Given a finite string x, can we in linear time O(|x|) recognize whether or not x is a prefix/substring of some infinite two-pattern string?
  • If recognized as two-pattern, can all the repetitions in x be computed in linear time?
  • Given an integer ℓ, how many of these "two-pattern" strings x of length ℓ are there?

In the previous paper we were able to answer the first of these questions in the affirmative, at least for "complete" two-pattern strings x. Here we show that, once a complete two-pattern string x has been recognized, its repetitions can all be computed in linear time using an iterative algorithm that in addition computes all the "near-repetitions" in x. The third question is dealt with in a subsequent paper.
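To make the notion of a repetition concrete, here is a minimal brute-force sketch in Python. It is not the linear-time iterative algorithm described in the abstract, and it does not recognize two-pattern strings. It assumes the common definition of a repetition as a triple (i, p, e): starting at position i, a primitive block of length p occurs e ≥ 2 times in a row, and the block cannot be extended by another copy on either side. The function names `is_primitive` and `naive_repetitions` are hypothetical, and positions are 0-based.

```python
def is_primitive(u: str) -> bool:
    """Return True if u is not a power v^k of a shorter string v (k >= 2)."""
    # A non-empty u is primitive exactly when its first occurrence in u+u
    # after position 0 is at position len(u).
    return len(u) > 0 and (u + u).find(u, 1) == len(u)


def naive_repetitions(x: str) -> list[tuple[int, int, int]]:
    """Brute-force list of repetitions (i, p, e) in x; far slower than O(|x|)."""
    n = len(x)
    result = []
    for p in range(1, n // 2 + 1):          # candidate period (block length)
        for i in range(n - 2 * p + 1):      # candidate start position
            u = x[i:i + p]
            if not is_primitive(u):
                continue
            # Skip starts that can be extended to the left by a copy of u.
            if i >= p and x[i - p:i] == u:
                continue
            # Count consecutive copies of u to the right.
            e = 1
            while x[i + e * p:i + (e + 1) * p] == u:
                e += 1
            if e >= 2:
                result.append((i, p, e))
    return sorted(result)


if __name__ == "__main__":
    print(naive_repetitions("abaababa"))
```

For x = "aaa" this sketch reports the single triple (0, 1, 3), and for x = "abab" it reports (0, 2, 2). Its running time is polynomial but well above linear, which is the gap the paper's algorithm closes for complete two-pattern strings.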


Similar resources

Two-pattern strings II - frequency of occurrence and substring complexity

The two previous papers in this series introduced a class of infinite binary strings, called two-pattern strings, that constitute a significant generalization of, and include, the much-studied Sturmian strings. The class of two-pattern strings is a union of a sequence of increasing (with respect to inclusion) subclasses TP_λ of two-pattern strings of scope λ, λ = 1, 2, … . Prefixes of two-pa...

Crochemore's Repetitions Algorithm Revisited - Computing Runs

Crochemore’s repetitions algorithm, introduced in 1981, was the first O(n log n) algorithm for computing repetitions. Since then, several worst-case linear-time algorithms for computing runs have been introduced. They all follow a similar strategy: first compute the suffix tree or array, then use the suffix tree or array to compute the Lempel-Ziv factorization, then using the Lempel-Ziv factorizat...

Lossless filter for multiple repetitions with Hamming distance

Similarity search in texts, notably in biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been created in order to speed up the solution of the problem. However, previous filters were made for speeding up pattern matching, or for finding repetitions between two strings or occurring twice in the same string. In this pa...

Repetitions in strings: Algorithms and combinatorics

The article is an overview of basic issues related to repetitions in strings, concentrating on algorithmic and combinatorial aspects. This area is important from both a theoretical and a practical point of view. Repetitions are highly periodic factors (substrings) in strings and are related to periodicities, regularities, and compression. The repetitive structure of strings leads to higher compress...

Computing Periodicities in Strings — A New Approach

The most efficient methods currently available for the computation of repetitions or repeats in a string x = x[1..n] all depend on the prior computation of a suffix tree/array ST_x/SA_x. Although these data structures can be computed in asymptotic Θ(n) time, in practice they involve significant overhead, both in time and space. Since the number of repetitions/repeats in x can be repo...


Publication date: 2005